Align tiny-Glm4MoeForCausalLM with GLM-4.5 reference config #5638

Merged
qgallouedec merged 57 commits into main from fix-tiny-glm4-moe
May 15, 2026
@qgallouedec

@qgallouedec qgallouedec commented Apr 24, 2026

Member

What does this PR do?

On top of #5637

before:

  attention_bias                                   True                               → False
  eos_token_id                                     [151329, 151336, 151338]           → None
  first_k_dense_replace                            3                                  → 1
  head_dim                                         128                                → <missing>
  hidden_size                                      5120                               → 8
  intermediate_size                                12288                              → 32
  moe_intermediate_size                            1536                               → 1408
  n_routed_experts                                 160                                → 4
  num_attention_heads                              96                                 → 4
  num_experts_per_tok                              8                                  → 2
  num_hidden_layers                                92                                 → 2
  num_key_value_heads                              8                                  → 2
  num_nextn_predict_layers                         1                                  → <missing>
  pad_token_id                                     151329                             → None
  rope_theta                                       1000000                            → 10000.0
  routed_scaling_factor                            2.5                                → 1.0
  use_qk_norm                                      True                               → False
  vocab_size                                       151552                             → 151365

after:

[config_diff] zai-org/GLM-4.5 vs tiny (10 differences)
  first_k_dense_replace                            3                                  → 1
  head_dim                                         128                                → 2
  hidden_size                                      5120                               → 8
  intermediate_size                                12288                              → 32
  moe_intermediate_size                            1536                               → 32
  n_routed_experts                                 160                                → 4
  num_attention_heads                              96                                 → 4
  num_experts_per_tok                              8                                  → 2
  num_hidden_layers                                92                                 → 2
  num_key_value_heads                              8                                  → 2
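
The two listings above are key-by-key comparisons of the reference config against the tiny one. As a rough illustration only, a diff of this shape could be produced as below. This is a sketch over plain dictionaries, not the repository's actual `print_config_diff`; the names `config_diff`, `format_config_diff`, and `MISSING` are hypothetical, and the sample dictionaries contain only a handful of the fields shown above.

```python
MISSING = "<missing>"  # hypothetical marker for a key absent on one side


def config_diff(reference: dict, tiny: dict) -> list:
    """Return (key, reference_value, tiny_value) for every key whose values differ."""
    rows = []
    for key in sorted(set(reference) | set(tiny)):
        ref_val = reference.get(key, MISSING)
        tiny_val = tiny.get(key, MISSING)
        if ref_val != tiny_val:
            rows.append((key, ref_val, tiny_val))
    return rows


def format_config_diff(rows: list, ref_name: str = "reference", tiny_name: str = "tiny") -> str:
    """Render the rows in a layout similar to the listings in this PR."""
    lines = [f"[config_diff] {ref_name} vs {tiny_name} ({len(rows)} differences)"]
    for key, ref_val, tiny_val in rows:
        lines.append(f"  {key:<48} {str(ref_val):<34} → {tiny_val}")
    return "\n".join(lines)


# A few fields taken from the listings above (not the full configs).
reference = {"head_dim": 128, "hidden_size": 5120, "use_qk_norm": True, "num_nextn_predict_layers": 1}
tiny_before = {"hidden_size": 8, "use_qk_norm": False}
tiny_after = {"head_dim": 2, "hidden_size": 8, "use_qk_norm": True, "num_nextn_predict_layers": 1}

print(format_config_diff(config_diff(reference, tiny_before), "zai-org/GLM-4.5"))
print(format_config_diff(config_diff(reference, tiny_after), "zai-org/GLM-4.5"))
```

With these sample inputs, the "after" diff keeps only the fields that are shrunk on purpose (`head_dim`, `hidden_size`), which mirrors the intent of this PR: the remaining differences should be size-related only.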

Before submitting

  • This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • Did you read the contributor guideline, Pull Request section?
  • Was this discussed/approved via a GitHub issue? Please add a link to it if that's the case.
  • Did you make sure to update the documentation with your changes?
  • Did you write any new necessary tests?

AI writing disclosure

We welcome the use of AI tools to help with contributions. For transparency and to help us improve our review process, please indicate the level of AI involvement in this PR.

  • No AI usage: the PR was written entirely by a human.
  • AI-assisted: some parts were suggested or improved by AI, but the PR was written and reviewed by a human.
  • AI-generated: the PR was mostly or fully generated by an AI tool.

Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag members/contributors who may be interested in your PR.


Note

Low Risk
Low risk: only updates the tiny-model generation script’s Glm4MoeConfig constants (no runtime/library code changes), but could affect downstream consumers expecting the previous tiny config.

Overview
Updates the GLM-4.5 tiny-model generation script to stop deriving vocab_size from the tokenizer and instead hardcode it, while setting several GLM-4.5-aligned config fields that were previously missing or left at defaults (e.g., moe_intermediate_size, head_dim, attention_bias, EOS/pad token IDs, RoPE theta, routed scaling factor, QK norm, and num_nextn_predict_layers).

This makes the generated tiny checkpoint’s config closer to the upstream reference, reducing config diffs when running print_config_diff and pushing the tiny model to the hub.
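
Since the remaining differences are all size fields, it can be useful to check that the shrunken values still fit together. The sketch below runs a few such checks on the tiny values from the "after" listing. The helper `check_tiny_config` is hypothetical and not part of this PR, and the third check assumes that layers with index at or above `first_k_dense_replace` are the MoE layers, which should be verified against the model code.

```python
# Size fields from the "after" listing in this PR.
tiny = {
    "first_k_dense_replace": 1,
    "head_dim": 2,
    "hidden_size": 8,
    "intermediate_size": 32,
    "moe_intermediate_size": 32,
    "n_routed_experts": 4,
    "num_attention_heads": 4,
    "num_experts_per_tok": 2,
    "num_hidden_layers": 2,
    "num_key_value_heads": 2,
}


def check_tiny_config(cfg: dict) -> list:
    """Return a list of human-readable problems; an empty list means all checks pass."""
    problems = []
    # Grouped-query attention: query heads are split evenly across key/value heads.
    if cfg["num_attention_heads"] % cfg["num_key_value_heads"] != 0:
        problems.append("num_attention_heads is not a multiple of num_key_value_heads")
    # Top-k routing cannot select more experts than exist.
    if cfg["num_experts_per_tok"] > cfg["n_routed_experts"]:
        problems.append("num_experts_per_tok exceeds n_routed_experts")
    # Assumption: layers with index >= first_k_dense_replace are MoE layers,
    # so at least one MoE layer requires first_k_dense_replace < num_hidden_layers.
    if cfg["first_k_dense_replace"] >= cfg["num_hidden_layers"]:
        problems.append("no MoE layer left after the dense prefix")
    return problems


print(check_tiny_config(tiny))
```

A check like this only covers relationships between fields; it says nothing about whether the tiny checkpoint loads or trains correctly.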

Reviewed by Cursor Bugbot for commit 1961d87. Bugbot is set up for automated code reviews on this repo.

Comment thread tests/conftest.py Outdated
@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 540502a8d3

Comment thread tests/conftest.py Outdated

@cursor cursor Bot left a comment

Cursor Bugbot has reviewed your changes and found 1 potential issue.

There are 2 total unresolved issues (including 1 from previous review).

Reviewed by Cursor Bugbot for commit 49d5fca.

Base automatically changed from new-tiny-model-generation to main May 5, 2026 15:47
@qgallouedec qgallouedec merged commit 7c3af3d into main May 15, 2026
6 of 13 checks passed
@qgallouedec qgallouedec deleted the fix-tiny-glm4-moe branch May 15, 2026 00:04
4 participants